A bilateral filtering-based image enhancement for Alzheimer disease classification using CNN

This study aims to develop an optimally performing convolutional neural network to classify Alzheimer’s disease into mild cognitive impairment, normal controls, or Alzheimer’s disease classes using a magnetic resonance imaging dataset. To achieve this, we focused the study on addressing the challenge of image noise, which impacts the performance of deep learning models. The study introduced a scheme for enhancing images to improve the quality of the datasets. Specifically, an image enhancement algorithm based on histogram equalization and bilateral filtering techniques was deployed to reduce noise and enhance the quality of the images. Subsequently, a convolutional neural network model comprising four convolutional layers and two hidden layers was devised for classifying Alzheimer’s disease into three (3) distinct categories, namely mild cognitive impairment, Alzheimer’s disease, and normal controls. The model was trained and evaluated using a 10-fold cross-validation sampling approach with a learning rate of 0.001 and 200 training epochs at each instance. The proposed model yielded notable results, such as an accuracy of 93.45% and an area under the curve value of 0.99 when trained on the three classes. The model further showed superior results on binary classification compared with existing methods. The model recorded 94.39%, 94.92%, and 95.62% accuracies for Alzheimer’s disease versus normal controls, Alzheimer’s disease versus mild cognitive impairment, and mild cognitive impairment versus normal controls classes, respectively.


Introduction
Alzheimer's disease (AD) is a neurological condition contributing to the progressive degeneration of cognitive abilities, including memory, visual-spatial perception, and alterations in personality and behavior.This ailment is commonly linked to the geriatric population, although it can impact individuals of all ages.It is postulated that a global population exceeding 1.5 million individuals exhibits indications of this particular ailment, with a probable escalation anticipated within the forthcoming two to three decades [1].The absence of a viable remedy for AD underscores the importance of timely identification to manage the condition and forestall further deterioration effectively.Individuals in the initial stages of the disease, referred to as mild cognitive impairment (MCI), primarily manifest indications such as memory impairment or diminished decision-making abilities, which may subsequently deteriorate into AD.Detecting MCI early enough would be beneficial for clinicians in preventing further progression [2,3].
Brain imaging is a known non-invasive technique for detecting diseases through the use of brain scans in the form of magnetic resonance imaging (MRI), computed tomography (CT), or positron emission tomography (PET) [4,5].Brain imaging has substantially contributed to the timely diagnosis of AD and detecting the transformation of MCI into AD using a deep learning methodology.Various alternative deep-learning methods have been successfully implemented to accurately categorize AD, including those of Y. Zhang et al. [6], Zhang et al. [7].
Despite these notable achievements, using MRI imaging to diagnose AD or predict its conversion using deep learning is still challenging.One of the key obstacles is the insufficiency of appropriate and uncontaminated data, which adversely impacts the training of deep learning models and consequently undermines their accuracy and generalizability [8][9][10][11][12].As the data is collected over time, there is a tendency for the datasets not to be enough for training models, and again, due to the varying image scanning protocols, imaging artefacts, and moving artefacts being used during the data collection, MRI images are highly susceptible to noise.Image quality can also vary due to the different imaging equipment being used.To address these challenges, it requires developing more robust deep learning models that are more tolerant to noisy images [13,14] or applying a data preprocessing approach to addressing image quality [15][16][17][18].
The motivation of our study is to propose a more reliable approach to improving image quality with ease for classifying AD with improved accuracy.As such, the research devised a proficient strategy for improving image quality by utilizing bilateral filtering and histogram equalization techniques.The proposed method aimed to reduce noise and enhance brightness in images, thereby improving the quality of images used in categorizing AD.The primary contribution of this study comprises two key components: • First, an image enhancement technique that effectively reduces noise in MRI images while preserving image quality was designed and implemented.The image enhancement algorithm for color images was developed using two image processing techniques: image equalization and bilateral filtering.The algorithm mainly improves the quality of the images while maintaining their edges and details.The bilateral filtering was used to keep the edges and details of the image, while histogram equalization modifies the intensities of the image to improve its contrast.
• Second, a CNN model consisting of four convolutional layers and two hidden layers successfully classifies AD more accurately than existing deep learning models.The model was trained using the k-fold cross-validation technique, ensuring that all the images were effectively used in training and testing the model.
The remaining sections of the paper are the literature review, materials and methods, results, and discussion.The literature review section presents a review of some existing stateof-the-art studies.A description of the dataset, the proposed method, and a detailed description of the experiment are presented under the materials and methods section.The results section reveals the study's experimental results, which are further discussed in the discussion section.

Literature review
This section analyzes prior studies that employed MRI images to construct machine learning and deep learning models to classify AD.A particular emphasis was placed on methods that address data inadequacy and image quality challenges.
Ebrahimi et al. [19] designed a pre-trained ResNet in their deep sequencing model to classify AD.The prognostication of the AD recorded an accuracy of approximately 91.78%, signifying a 10% enhancement compared to other CNNs.Kang et al. [20] developed a deep ensemble 2D CNN architecture incorporating multiple models and slides.The model's accuracy was 90.36% on normal controls (NC)/AD, 77.19% on AD/MCI classes, and 72.36% on NC/MCI class labels.
Alinsaif and Lang [21] designed a methodology that integrated 3D shearlet-based descriptors to decrease the dimensionality of MRI datasets.The method showed a noteworthy degree of efficacy in identifying AD.Mahendran and P. M. [22] developed a framework to detect AD that achieved an accuracy of 88.7% using a limited number of data records.
Zhang et al. [23] proposed a multi-modal model to facilitate timely AD detection.The proposed method showed superior performance in diagnosing AD, as evidenced by its training on two image modalities, namely PET and MRI, obtained from ADNI.
Spasov et al. [24] deployed a parameter-efficient model to check the conversion of MCI to AD.The approach combines various data types, including MRI images, demographic information, neuropsychological data, and genetic data.The method showed adaptability and has the potential to incorporate additional imaging techniques, including PET images and various collections of medical information.
In their scholarly work, Sharma et al. [25] presented a model capable of extracting all-level features.The model, FDN-ADNet, designed for early AD diagnosis using MRI images, yielded encouraging outcomes.Abrol et al. [26] applied ResNet in analyzing medical images to investigate the conversion of MCI to AD.Initially, the deep models were trained solely on MCI individuals for prediction purposes.Subsequently, a domain transfer learning version was employed, which involved additional training on AD and NC.The frameworks demonstrated significant prominence in the use of the model.The authors of [27] tackled the issue of insufficient data and missing data in diagnosing AD.The accuracy of all four classes of subjects used was observed to have improved by up to 3% when both complete and incomplete samples were considered during the model generation and testing phases.
In their study, Liu et.al. [28] presented a CNN-based approach for detecting AD.The model was specifically designed to train on a limited quantity of MRIs and accurately identify cases of AD.The model's accuracy was approximately 78.02%, and it was observed that its portability was enhanced as an added advantage.Janghel and Rathore [29] applied a distinctive approach to improve the efficiency of Convolutional Neural Network (CNN) models.They implemented a preprocessing technique on an image dataset before submitting it to the CNN architecture for feature extraction.The experimental findings indicate that an fMRI dataset yielded a precision of 99.95%, whereas the PET data stood at 73.46%.
In their study, Bae et.al. [30] employed transfer learning to forecast the progression of MCI to AD dementia using a 3-D CNN.An 82.4% accuracy recorded for the target task surpassed the performance of existing models in the respective domain.Sathiyamoorthi et al. [31] proposed a method that exploits an adaptive histogram adjustment image correction algorithm to improve the quality of the image in terms of brightness and contrast.The algorithm was applied to segment the interested AD regions in an adaptive and modified manner.The diverse characteristics were computed using a second-order Gray Level Co-Occurrence Matrix.The method classified diseased images and their respective stages based on distinctive features.The empirical findings demonstrate that the suggested approach yields superior levels of precision and efficacy compared to the current system.
Ferri et al. [32] proposed a classification system with LORETA source estimation derived from rsEEG and sMRI variables.The study implemented two artificial neural networks (ANNs) to classify participants as having AD or being part of the control group.The results indicated classification accuracy of 83%, and 86% for the two ANNs, respectively.
In their study, Tong et al. [33] developed a CNN-Sparse Coding Network (CNN-SCN) architecture to detect MCI before it converts.The empirical findings indicate that the model exhibits good stability, accuracy, and generalization.The experiment recorded an accuracy rate of 92.6% in the AD/CN class, while the AD/MCI and MCI/CN classes recorded an accuracy of 74.9% and 76.3%, respectively.
Hridhee et al. [34] investigated and proposed an early detection 2D CNN model for classifying AD using MRI data samples.The researchers applied various image processing and augmentation strategies to the dataset.Subsequently, they trained the data on a VGG16, Xception, and a custom model.The customized model achieved the best performance of the selected models with an accuracy of 94.77%.
Allada et al. [35] aimed to enhance the accuracy of classifying AD and developed a swarm multi-verse algorithm to optimize a deep neuro-fuzzy network for AD classification at various stages.Their study incorporated techniques such as median filtering for preprocessing to improve image quality, a channel-wise feature pyramid network for image segmentation, and CNN for feature extraction.Their model achieved an accuracy of 89.9%, assessed based on kfold values.
Gowhar et al. [36] introduced a deep learning framework centered on CNN for early AD detection.They implemented various data preprocessing and augmentation techniques on the dataset.The classifier was constructed through feature extraction, reduction, and classification.Their proposed model surpassed existing models, achieving an accuracy of 96.22%.
Tufail et al. [37] developed an early-stage AD classifier using 2D and 3D CNN with PET images.Their study differentiated between binary and multiclass categories of AD.Experimental results highlighted superior performance with an accuracy of 89.21% for the AD/NC binary classification using the 3D-CNN model.They employed data augmentation and 5-fold cross-validation to boost the model's performance.
This section examined the performance of some deep learning models in classifying AD and is summarized in Table 1.

Subjects/datasets
The datasets were acquired from the AD Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu) [38].In 2003, the ADNI was established as a public-private collaboration with the primary objective of evaluating the feasibility of integrating MRI, PET, and other biological markers to investigate and curb the degeneration of MCI to AD.
This study extracted MRI images from a sample of 396 subjects diagnosed with AD, 255 subjects classified as NC, and 104 subjects with MCI from the ADNI database captured from 2005 to 2021 and within an age range of 55 to 97 years old.The MRI datasets were initially in DICOM format and were subsequently transformed into JPG format.The images collected for the experiment included 1,581 AD images, 1,310 MCI images, and 1,591 NC images.Each image was initially dimensioned as 256 x 256 pixels and later resized to 64 x 64 pixels.

Method
In this study, we sought to build a deep learning model with optimal performance to classify AD into three categories: MCI, AD, and NC.To achieve this, we proposed an approach to improve the quality of the image datasets during the data preprocessing stage, after which the data volume was increased using data augmentation.The data was then sampled using the Kfold cross-validation technique to ensure that all data samples were used in the training and testing.The method divides the datasets into K folds, of which one-fold is used for testing and the remaining for training.The technique assesses the model's capability to handle new examples, with K referring to the number of groups into which the data samples are divided.As presented in the subsequent section, a convolutional neural network was proposed, which was trained and evaluated iteratively using 10-fold cross-validation.For each iteration, the model is tested with the one-fold test data, after which the recorded test results are averaged.A detailed representation of the flow of activities in the proposed method is shown in Fig 1.
The experiment was implemented with an NVIDIA GeForce GTX 1060 Graphic Processing Unit (GPU) machine operating on CUDA 10.1 with 8 GB of memory allocation.The model was implemented in Python using the TensorFlow framework and other libraries such as Sklearn, Numpy, Keras, Pillow and Opencv-python.

Data preprocessing
Data preprocessing is an essential step in conducting a deep learning experiment, as the efficacy of the dataset can significantly influence the learning outcomes of the models.The MRI image datasets utilized in this study were initially transformed into JPG format to facilitate manual preprocessing of the datasets.MRI images that were corrupt or of insufficient quality were manually removed.Following manual preprocessing, it was observed that the dataset still  Algorithm 1 was designed to improve the contrast of an input image by performing histogram equalization on each color channel, followed by bilateral filtering on the equalized channels.The bilateral filtering aims to maintain the edges and details of the image.The filtered channels are recombined into a unified RGB image and subjected to normalization procedures to enhance overall contrast.Eq (1) defines bilateral filtering.
The equation for the image value at pixel position p is defined as I p , where G σ represents a 2D Gaussian kernel.The normalization factor, W p , is utilized to ensure that the pixel weights sum up to 1.0.The parameters s and r determine the extent of filtering applied to the image.Eq (3) defines the histogram equalization process, which is used to modify an image's intensities to improve its contrast.
where L is the number of possible intensity values, I n is the number of pixels with intensity, n and T p is the total number of pixels.

Convolutional neural network model
The

Performance evaluation
Various metrics were used to measure the model's performance during training and testing.These include recall, precision, accuracy and area under the curve (AUC).
The accuracy metric represents the aggregate count of accurately predicted labels by the model and is defined as Eq (5).
TN is the total number of true positives, TN is the true negatives, FP refers to the false positives, and FN is the false negatives.As shown in Eq (6), precision is the ratio of true positives to the total number of predictions made.
Recall calculates the true positive rates as presented in Eq (7).
The ROC curve's integral, AUC, corresponds to the likelihood of a positive sample being ranked at random higher than a negative sample.Given that the ROC curve is constructed through the successive joining of coordinates (x 1 , y 1 ), (x 2 , y 2 ), . .., (x m , y m ), the AUC is approximated using Eq (8).

Results
In this study, an algorithm for enhancing images was developed by integrating two image processing techniques: image equalization and bilateral filtering.The algorithm primarily enhances the image quality while preserving its edges and details.Bilateral filtering is employed to maintain the edges and details of the image, whereas histogram equalization adjusts the image's intensity levels to improve contrast.In order to ascertain the impact of the image transformation by the algorithm, image quality parameters such as mean pixel intensity, contrast, and entropy were used.The mean pixel intensity calculates the average of the pixel intensities across the image, and contrast evaluates the variability (standard deviation) of pixel intensities, contributing to the sharpness or clarity of the image.An increase in mean pixel intensity indicates that the image has become brighter, while a higher contrast suggests enhanced sharpness.Entropy measures the richness of information in the image's intensity distribution, with an increase in entropy indicating an improvement in image quality.The mean pixel intensity, contrast, and entropy values were calculated for both the original and enhanced images and visualized in The training and testing were done on both the original and enhanced datasets to aid in assessing the impact of the image transformation.The test outcomes are displayed in Table 2.
The model attained an accuracy of 90.34% on the original dataset, while on the transformed dataset, an accuracy of approximately 93.45% was recorded, accompanied by a standard deviation of 1.20%, suggesting the model's capacity to forecast the labels accurately.The precision of the model in identifying true positives was 93.70%, with a standard deviation of 1.21%.The recorded recall value is 93.24%, with a standard deviation of 1.25%.The model achieved a high  3.The test results show a consistent accuracy, precision, and recall score.The model's accuracy rates for classifying AD versus MCI, AD versus NC, and MCI versus NC stood at 94.92%, 94.39%, and 95.62%, respectively.Our proposed model demonstrated good performance, as evidenced by AUC scores of 0.98, 0.98, and 0.99 for  comparing AD versus MCI, AD versus NC, and MCI versus NC, respectively.The confusion matrix and the ROC curve for the binary classification are presented in Figs 6 and 7, respectively.
To adequately examine the model's performance, we conducted a comparative study with prevailing models employed for AD classification.The accuracy metric was the baseline metric for comparing this study to the existing models.The findings suggest the superior efficacy of our research in AD classification, as demonstrated in Table 4.

Discussion
The experimental results highlight the superior performance of the proposed method in relation to other methods discussed in the literature, especially in the binary and multiclass classification of AD.As evident from Table 4, the accuracy of the proposed model surpassed that of [6,20,22,32,33,37] in binary classification.In multiclass scenarios, our model outperformed the methods from [28,37] that considered the same number of classes as in this study.Conversely, methods [34,36] which encompassed more than the three classes in our study  exhibited higher accuracies than our proposed model.Thus, the findings suggest that our model stands out as the most effective for binary classification, as well as for multiclass classification, when specifically focusing on AD, MCI, and NC labels.

Conclusion
AD is a prevalent neurological disorder, predominantly affecting the elderly, and remains without a definitive cure.Early detection of AD is paramount to manage and potentially slow its progression from a mild stage to more severe stages.While deep learning offers promising avenues for detecting AD, it also presents specific challenges, with image quality being a prominent one.The study proposed and implemented an image quality scheme using histogram equalization and bilateral filtering to improve the dataset's quality, which was then used to train the proposed convolutional neural network.The proposed model consisted of four convolutional layers and two fully connected layers.It was trained using 10-fold cross-validation on three class labels: AD, MCI, and NC.The resulting accuracy, precision, recall score, and AUC score stood at 93.45%, 93.70%, 93.24%, and 0.99, respectively.These scores were comparatively higher than those of some selected state-of-the-art existing models.

1 . 1
contained noisy images with varying brightness levels, highlighting the necessity for additional preprocessing measures.An algorithm was proposed to address the issue of image noise and varying brightness.The algorithm was designed with histogram equalization and bilateral filtering, as presented under Algorithm Algorithm Image enhancement algorithm I RGBimage [I r , I g , I b ] r , I g , I b ) I normalizedImage(I)
Fig 4. The results indicate that the enhanced images showed more brightness, sharpness and richness of information since the mean pixel intensity, contrast, and entropy are higher.This study developed a CNN model to classify AD based on three distinct class labels: AD, MCI, and NC.The study further investigated the binary classification of AD versus MCI, AD versus NC, and MCI versus NC.The model was trained using a 10-fold cross-validation technique, with the learning rate hyperparameter set to 0.001, a batch size of 64, and 200 epochs.

Table 1 . Summary of existing CNN models.
https://doi.org/10.1371/journal.pone.0302358.t001 dimension.Two (2) hidden layers were incorporated into the model, wherein a batch normalization layer and a dropout layer succeeded every layer.The model generated 426,979 parameters, consisting of 425,187 trainable and 1,792 nontrainable.Additional hyperparameters include a learning rate of 0.001, a batch size of 64, and 200 training epochs.